@A-Marcelli

Added FP8 support for conversion and arithmetic operations following the template of the softfloat libraries. The operations are FULLY verified.

Both IEEE-754-like and OCP (NVIDIA) FP8 formats are supported, selectable with a macro in the softfloat_types.h header.

Contributor

@nibrunieAtSi5 nibrunieAtSi5 left a comment

That is a very long PR (193 modified files).
It would be good to write a larger description and some more details about what "The operations are FULLY verified" actually means.

What is the difference between f8_1 and f8_2? (My apologies if this is documented somewhere.)

Comment on lines +33 to +37
#if E4M3_OFP8 == 1
uint_fast16_t infOrNaN = 0;
#else
uint_fast16_t infOrNaN = expF8_1UI( uiA ) == 0x0F;
#endif
Contributor

Why use a macro to distinguish between OFP8 types?

Contributor

#define E4M3_OFP8          1 //When set to 1, the FP8 will be the OFP/Nvidia one.              	When set to 0, it will be the ieee-like one.

(from softfloat_types.h).
What does "it will be the ieee-like one" actually mean? The binary8 formats from P3109? Which one?

Copy link

You can focus on cherry-picking the commits from @A-Marcelli and ignore most, if not all, of my commits. Mine mostly concern integrating the newly added softfloat_8 libs with our fork of Spike, and integrating that fork as a shared ".so" library with the SoC we work on at our lab:
https://github.com/klessydra/pulpino-klessydra

Here you can find the vfpu.c test ( https://github.com/klessydra/spike-with-minifloat-fp8-support/tree/spike-fp8/vfpu ), which we compile with LLVM and run on Spike as bare metal (without pk). We use it to verify some of the arithmetic operations.

For a more comprehensive verification, you can use this other repo to verify the full list of instructions for f8_1 (e4m3) and f8_2 (e5m2) ( https://github.com/klessydra/generic_float_calculator/tree/main/reference_model ). Build the reference model inside it to see the entire list of ops and results. The repo simply takes the softfloat libs plus the ones we made and puts them in this golden model.

Below is a set of screenshots from the reference model. Note that the results have been verified against an FPU that supports both 8-bit float types, and they all passed (after an extensive two-year-long effort of verifying all corner cases).

[Images: three screenshots from the reference model]

@A-Marcelli
Author

Sorry for the overly short description in the first message.

I started this branch before the OCP standard came out, when the FP8 formats were still undocumented and existed in two variants: an IEEE-754-like one, where FP8 E4M3 has infinities, NaNs, etc. like a normal IEEE-754 format, and the NVIDIA/Intel/Arm one that later became the OCP standard's format. Both are supported on different architectures. For example, the IEEE-754-like e4m3 is supported in the Intel FP8-Emulation-Toolkit (here on GitHub). The P3109 special interest group did not exist at the time, so the formats in this pull request are not the P3109 ones; they are e4m3 and e5m2 following the standard IEEE-754 template used for higher-precision formats.
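To make the e4m3 difference concrete, here is a minimal decoding sketch. It is not code from this pull request: the function name `e4m3_to_float` and the `ocp` flag are hypothetical, and it assumes an exponent bias of 7 for both variants. In the IEEE-754-like variant the all-ones exponent is reserved for infinities and NaNs, while in the OCP variant only the all-ones exponent with an all-ones fraction is a NaN and there are no infinities.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical helper, not part of the PR: decode an e4m3 byte to float.
   ocp != 0 selects the OCP interpretation, ocp == 0 the IEEE-754-like one.
   Assumes 1 sign bit, 4 exponent bits (bias 7), 3 fraction bits. */
static float e4m3_to_float(uint8_t bits, int ocp)
{
    int sign = bits >> 7;
    int exp  = (bits >> 3) & 0x0F;
    int frac = bits & 0x07;
    float mag;

    if (exp == 0x0F && !ocp) {
        /* IEEE-754-like: top exponent holds infinity and NaNs. */
        mag = frac ? NAN : INFINITY;
    } else if (exp == 0x0F && ocp && frac == 0x07) {
        /* OCP: a single NaN pattern per sign, no infinities. */
        mag = NAN;
    } else if (exp == 0) {
        /* Subnormal: (frac / 8) * 2^-6 */
        mag = ldexpf((float)frac, -9);
    } else {
        /* Normal: (1 + frac / 8) * 2^(exp - 7) */
        mag = ldexpf((float)(8 + frac), exp - 10);
    }
    return sign ? -mag : mag;
}
```

Under these assumptions, the byte 0x78 decodes to +infinity in the IEEE-754-like variant and to 256 in the OCP variant, and the largest finite values are 240 (0x77) and 448 (0x7E) respectively.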

The e5m2 is identical across the standards, so the only difference is the e4m3. In this code I support both by using macros.
Moreover, since the OCP specification only deals with conversions, a couple of extra macros were necessary to decide the behaviour of OCP arithmetic results with respect to saturation and to the single e4m3 NaN, which is technically a quiet one but can be made signaling with the macro.
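As an illustration of the saturation choice, the sketch below picks the result encoding for an OCP e4m3 operation that overflows. The function name `ocp_e4m3_overflow` and the `saturate` parameter are hypothetical and stand in for whatever macros the library actually uses; the two outcomes mirror the saturating and non-saturating conversion modes of the OCP specification.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the PR: encoding returned when an
   OCP e4m3 result overflows the finite range.
   saturate != 0: clamp to the largest finite magnitude (0x7E, i.e. 448).
   saturate == 0: return the NaN pattern (0x7F), since OCP e4m3 has no
   infinity to overflow to. */
static uint8_t ocp_e4m3_overflow(int sign, int saturate)
{
    uint8_t s = sign ? 0x80 : 0x00;
    return saturate ? (uint8_t)(s | 0x7E) : (uint8_t)(s | 0x7F);
}
```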

So, sorry for the confusing name, but "E4M3_OFP8" does not distinguish between OCP's e4m3 and e5m2; it distinguishes between OCP's e4m3 and the IEEE-754-like e4m3.

In all the files, fp8_1 means the e4m3 format and fp8_2 means the e5m2 format. Sorry again for the confusing names.

To verify all the results, we built a separate calculator that executes all possible combinations of instructions and inputs (8-bit softfloat only, obviously) in higher precision, then rounds to fp8 and compares the results. Moreover, this fp8 library was used to verify a large external hardware 8-bit floating-point unit.
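The exhaustive approach can be sketched as follows for e5m2 addition. This is not the calculator's actual code: every name here (`e5m2_to_double`, `double_to_e5m2`, `ref_add`, `count_mismatches`, `f8_binop`) is hypothetical, only round-to-nearest-even is modelled, and any NaN encoding is accepted where a NaN is expected. The reference computes in double, where the sum of two e5m2 values is exact, then rounds once by searching for the nearest encoding.

```c
#include <math.h>
#include <stdint.h>

/* Decode e5m2: 1 sign, 5 exponent (bias 15), 2 fraction bits. */
static double e5m2_to_double(uint8_t b)
{
    int exp = (b >> 2) & 0x1F, frac = b & 0x03;
    double mag;
    if (exp == 0x1F)   mag = frac ? NAN : INFINITY;
    else if (exp == 0) mag = ldexp(frac, -16);           /* subnormal */
    else               mag = ldexp(4 + frac, exp - 17);  /* normal */
    return (b & 0x80) ? -mag : mag;
}

/* Round to nearest e5m2, ties to even, by searching magnitude codes.
   Code 0x7C (infinity) stands in for 2^16 so overflow rounds to infinity. */
static uint8_t double_to_e5m2(double x)
{
    uint8_t sign = signbit(x) ? 0x80 : 0x00, best = 0;
    double a = fabs(x), best_err = INFINITY;
    if (isnan(x)) return 0x7E;               /* one quiet-NaN encoding */
    if (isinf(x)) return sign | 0x7C;
    for (int c = 0; c <= 0x7C; c++) {
        double v = (c == 0x7C) ? 65536.0 : e5m2_to_double((uint8_t)c);
        double err = fabs(v - a);
        /* On an exact tie, prefer the code with an even last bit. */
        if (err < best_err || (err == best_err && (c & 1) == 0)) {
            best_err = err;
            best = (uint8_t)c;
        }
    }
    return sign | best;
}

/* Reference addition: exact in double, then a single rounding. */
static uint8_t ref_add(uint8_t a, uint8_t b)
{
    return double_to_e5m2(e5m2_to_double(a) + e5m2_to_double(b));
}

static int is_nan8(uint8_t b) { return (b & 0x7F) > 0x7C; }

typedef uint8_t (*f8_binop)(uint8_t, uint8_t);

/* Run the function under test on all 256 x 256 input pairs and count
   results that differ from the reference. */
static long count_mismatches(f8_binop dut)
{
    long bad = 0;
    for (int a = 0; a < 256; a++)
        for (int b = 0; b < 256; b++) {
            uint8_t want = ref_add((uint8_t)a, (uint8_t)b);
            uint8_t got  = dut((uint8_t)a, (uint8_t)b);
            if (is_nan8(want) ? !is_nan8(got) : got != want)
                bad++;
        }
    return bad;
}
```

With only 65,536 input pairs per binary operation, a brute-force search for the nearest encoding is fast enough and avoids sharing rounding logic with the code under test.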

If you want to read more about it, you can find the full description and more in this paper: https://www.mdpi.com/3526480 . I hope there are no rules against posting links; if there are, sorry.

The arithmetic functions provided cover all those required by IEEE-754 for its formats.

If you have any more questions, or if I was not clear, I'll be happy to respond. Thank you for reviewing this pull request.

3 participants